Construction and evaluation of neonatal respiratory failure risk prediction model for neonatal respiratory distress syndrome

Background Neonatal respiratory distress syndrome (NRDS) is a common respiratory disease in preterm infants and is often accompanied by respiratory failure. The aim of this study was to establish and validate a nomogram for predicting the probability of respiratory failure in patients with NRDS. Methods Patients diagnosed with NRDS were extracted from the MIMIC-IV database and randomly assigned to a training cohort and a validation cohort. Univariate and stepwise Cox regression analyses were used to determine the prognostic factors of NRDS. A nomogram containing these factors was established to predict the incidence of respiratory failure in NRDS patients. The receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), calibration curves and decision curve analysis were used to evaluate the performance of the model. Results The study included 2,705 patients with NRDS. Univariate and multivariate stepwise Cox regression analyses showed that the independent risk factors for respiratory failure in NRDS patients were gestational age, pH, partial pressure of oxygen (PO2), partial pressure of carbon dioxide (PCO2), hemoglobin, blood culture, infection, neonatal intracranial hemorrhage, pulmonary surfactant (PS), parenteral nutrition and respiratory support. The nomogram was then constructed and validated. Conclusions This study identified the independent risk factors for respiratory failure in NRDS patients and used them to construct and evaluate a respiratory failure risk prediction model for NRDS. The findings can help clinicians identify NRDS patients at high risk of respiratory failure and intervene at an early stage.


Introduction
Neonatal respiratory distress syndrome (NRDS) is the most common respiratory system disease in premature babies, particularly those born before 28 weeks of gestation [1,2]. It is caused by impaired effective ventilation in neonates due to a lack of pulmonary surfactant (PS) or immature development of the lung [3,4]. Because hyaline membranes form as part of its pathophysiology, it is also called neonatal pulmonary hyaline membrane disease [5]. The disease causes progressively worsening inspiratory dyspnea. NRDS patients may experience rapid breathing, grunting sounds while breathing, and flaring nostrils; they may also have a bluish tint to their skin due to inadequate oxygenation [6]. NRDS is common: it affects 5% of near-term infants, 30% of infants with a gestational age of less than 30 weeks, and 60% of premature infants with a gestational age of less than 28 weeks [7]. Many premature infants also die because of NRDS [8]. Severe NRDS can lead to neonatal respiratory failure (NRF), which is defined as decreased oxygen saturation and oxygen partial pressure (PO2), or the need for endotracheal intubation and mechanical ventilation [9]. NRF may develop some time after the onset of NRDS, triggered by various causes. It affects the development of the child's circulatory system, nervous system, metabolism and other functions, and can seriously worsen the prognosis of newborns [10].
At present, prenatal use of dexamethasone to promote fetal lung development and maturation [11], postpartum PS supplementation [12], and effective ventilation therapy [13] have reduced the incidence of NRDS and have also changed its severity and typical manifestations. However, NRDS remains the most common respiratory disease in preterm infants in the neonatal intensive care unit (NICU), and there are many cases of NRDS leading to NRF [14]. Therefore, identifying NRDS patients with a high probability of developing NRF would support early medical intervention and is of great significance for improving the prognosis of these children.
Predictive models have previously been developed for neonatal respiratory distress syndrome in both preterm and late-preterm infants, as well as for predicting other complications associated with NRDS [15,16]. Nevertheless, a predictive model for respiratory failure in the context of neonatal respiratory distress syndrome has yet to be established. A newborn is an infant in the first 28 days of life after birth. During this neonatal period, infants diagnosed with NRDS are at high risk of developing NRF. This study therefore aims to investigate the likelihood of NRF occurrence among neonates diagnosed with NRDS at both day 1 and day 28 after birth, and then to establish a predictive model for the development of NRF in NRDS.

Data source
This study was a retrospective observational study based on the Medical Information Mart for Intensive Care IV (MIMIC-IV version 1.0) database (https://physionet.org/content/mimiciv/1.0/), which is a large, freely accessible database of de-identified medical records for patients admitted to the intensive care unit (ICU) at the Beth Israel Deaconess Medical Center in Boston, Massachusetts, USA. It contains data from over 100,000 ICU stays between 2008 and 2019, making it one of the largest publicly available critical care datasets in the world [17]. The MIMIC-IV database includes information on patient demographics, vital signs, laboratory results, medications, diagnoses, procedures, and other clinical data. It also contains free-text nursing notes and physician progress notes, which can be used for natural language processing and other text-based analyses. The MIMIC-IV database has been used for a wide range of research studies, including machine learning and artificial intelligence approaches for predicting patient outcomes, developing clinical decision support systems, and improving patient care. It has also been used to investigate clinical questions related to sepsis, acute respiratory distress syndrome, cardiac arrest, and other critical care conditions. Individuals who have finished the Collaborative Institutional Training Initiative examination (Certification number 50366200 for YL) can access the database.

Study population
In our study, we included neonatal patients with NRDS, and with NRF secondary to the onset of NRDS. NRDS was identified using diagnostic codes from the International Classification of Diseases, 9th revised (ICD-9) and 10th revised (ICD-10) editions [18,19], and we defined cases with a PaO2 level below 50 mmHg as neonatal respiratory failure [20,21]. We extracted these patients' parameters from MIMIC-IV and collected the following data: basic information including gestational age, gender, ethnic group, admission time, onset time and discharge time. Biological variables were then collected, including peripheral blood white blood cells (WBC), hemoglobin (Hb) and platelets (PLT) from the routine blood examination; bilirubin from the blood biochemistry; pH, PO2 and partial pressure of carbon dioxide (PCO2) from the blood gas analysis; and blood culture and cerebrospinal fluid (CSF) culture results. All data were collected within 48 h of patient admission, and in cases with multiple measurements, we analyzed only the initial measurement. The clinical variables mainly included intrauterine growth retardation (IUGR), neonatal asphyxia, neonatal apnea, neonatal jaundice, neonatal intracranial hemorrhage, neonatal coagulation disorders, neonatal pneumonia, neonatal anemia and infection. Treatment measures included the use of PS, noninvasive ventilation, caffeine, and parenteral nutrition (each recorded as used or not used). The code for data extraction is available on GitHub (https://github.com/MIT-LCP/mimic-iv).
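The selection rule described above (only the initial measurement within 48 h of admission, and a PaO2 threshold of 50 mmHg for NRF) can be illustrated with a minimal Python sketch. The record layout, patient identifiers and values below are hypothetical and do not correspond to MIMIC-IV table or column names; the sketch is not the extraction code used in the study.

```python
from datetime import datetime, timedelta

# Hypothetical admission times and blood gas records (patient, time, PaO2 in mmHg).
# These names and values are illustrative only, not MIMIC-IV fields.
ADMISSIONS = {
    "A": datetime(2015, 1, 1, 8, 0),
    "B": datetime(2015, 1, 2, 9, 0),
}
BLOOD_GAS = [
    ("A", datetime(2015, 1, 1, 12, 0), 62.0),
    ("A", datetime(2015, 1, 1, 9, 30), 45.0),
    ("B", datetime(2015, 1, 5, 9, 0), 40.0),  # more than 48 h after admission
    ("B", datetime(2015, 1, 2, 10, 0), 71.0),
]

WINDOW = timedelta(hours=48)
NRF_PAO2_THRESHOLD = 50.0  # mmHg, the threshold stated in the text


def first_measurement_within_window(admissions, records, window=WINDOW):
    """Return {patient_id: value} keeping the earliest record inside the window."""
    earliest = {}
    for pid, time, value in records:
        admit = admissions[pid]
        if not (admit <= time <= admit + window):
            continue  # discard measurements outside the first 48 h
        if pid not in earliest or time < earliest[pid][0]:
            earliest[pid] = (time, value)
    return {pid: value for pid, (_, value) in earliest.items()}


def meets_nrf_criterion(pao2):
    """True when a PaO2 value falls below the 50 mmHg threshold."""
    return pao2 < NRF_PAO2_THRESHOLD


initial_pao2 = first_measurement_within_window(ADMISSIONS, BLOOD_GAS)
```

In this toy data, patient A keeps the 09:30 value rather than the later 12:00 value, and the late record for patient B is ignored. How the time of NRF onset was derived for the survival analysis is not specified in the text, so the sketch does not attempt it.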

Statistical analysis
For nomogram construction and validation, we randomly divided all the NRDS patients into training and validation cohorts in a ratio of 7:3 [22]. The demographic and clinical characteristics of the patients were described in the training and validation datasets. Univariate Cox and stepwise Cox regression analyses were used to screen variables. Variables with P values of less than 0.05 (P < 0.05) in univariate Cox regression analysis were included in the multivariate Cox proportional hazards regression analysis. To simplify the model and prevent collinearity of variables, multivariate Cox proportional hazards regression analysis was performed to identify variables that significantly affected the onset of NRF, using a significance threshold of P < 0.05 [23]. These eligible variables were included in the final Cox proportional hazards model, and the corresponding nomogram was drawn. The predicted values of the nomogram were calculated and compared with the observed values. The calibration curve [24], receiver operating characteristic (ROC) curve [25] and decision curve [26] were drawn to test the performance of the model. All statistical analyses were conducted using R 4.2.1 (https://www.r-project.org/). Among the R packages used, tableone (0.13.2) was used for data description, survival (3.2.13) was used for feature selection, and rms (6.2.0) was used for model construction and nomogram drawing. A two-sided P < 0.05 was considered to indicate statistical significance.
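As an illustration of the first two steps (the random 7:3 split and the univariate screening at P < 0.05), a minimal Python sketch is given below. The study itself was carried out in R; the seed, the function names and the P values here are assumptions made for the example and are not taken from the analysis.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2023):
    """Shuffle the patients with a fixed seed and cut at the training fraction."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # the seed is an arbitrary illustrative choice
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def screen_variables(p_values, alpha=0.05):
    """Keep the variables whose univariate P value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


# 2,705 patients, as in the study; the identifiers are just integers here.
train_ids, valid_ids = split_cohort(range(2705))

# Hypothetical univariate P values, for illustration only.
UNIVARIATE_P = {"gestational_age": 0.001, "gender": 0.40, "infection": 0.02}
candidates = screen_variables(UNIVARIATE_P)
```

A fixed cut of this kind yields 1,893 and 812 patients, whereas the study reports 1,899 and 806, so the exact randomization procedure used there evidently differed from this sketch.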

Patient characteristics
A total of 2705 patients diagnosed with NRDS between 2008 and 2019 were included in this study, and NRF was observed in 1194 (44.1%) of them. The training and validation cohorts of NRDS patients consisted of 1899 and 806 cases, respectively. In the total cohort of NRDS patients, the most frequently recorded ethnic group was white (30.7%), and most patients were male (57.6%). Patients with infection accounted for 16.1% and 17.5% of those in the training and validation cohorts, respectively, while patients with IUGR accounted for 8.1% and 7.0%, and those with neonatal asphyxia accounted for 0.4% and 0.1%. From the laboratory test results, the median pH in both cohorts was 7.29 [7.24, 7.34] (Table 1). There was no statistically significant difference in these variables between the training and validation cohorts (P > 0.05).

Nomogram construction
We developed a nomogram predicting the occurrence of NRF at day 1 and day 28 in patients with NRDS, based on the risk factors selected in the training cohort (Fig. 1). Each level of every variable was assigned a score on the points scale. The total score was obtained by adding the scores of the selected variables. The prediction corresponding to this total score was then used to estimate the probability of NRF at day 1 and day 28 for each NRDS patient.
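The scoring logic of a Cox-based nomogram can be sketched as follows: each predictor contributes points in proportion to its effect on the linear predictor, and the event probability at time t follows from the baseline survival as 1 - S0(t)^exp(linear predictor). The coefficients and baseline survival below are hypothetical placeholders, not the fitted values from this study, and only binary predictors are shown for brevity.

```python
import math

# Hypothetical log hazard ratios; NOT the fitted coefficients from the study.
COEFFICIENTS = {
    "gestational_age_lt_28w": 0.9,
    "infection": 0.6,
    "pulmonary_surfactant": -0.5,
}
BASELINE_SURVIVAL_DAY1 = 0.80  # hypothetical S0(t) at day 1 with all predictors at 0


def points_table(coefs):
    """Scale each binary predictor so the largest effect spans 100 points."""
    largest = max(abs(beta) for beta in coefs.values())
    table = {}
    for name, beta in coefs.items():
        pts = 100.0 * abs(beta) / largest
        # The lower-risk level of each predictor receives 0 points.
        table[name] = {1: pts, 0: 0.0} if beta >= 0 else {1: 0.0, 0: pts}
    return table


def total_points(patient, table):
    """Sum the points for a patient's level of every predictor."""
    return sum(table[name][patient[name]] for name in table)


def event_probability(patient, coefs, baseline_survival):
    """Cox model: P(event by t) = 1 - S0(t) ** exp(linear predictor)."""
    lp = sum(coefs[name] * patient[name] for name in coefs)
    return 1.0 - baseline_survival ** math.exp(lp)


TABLE = points_table(COEFFICIENTS)
low_risk = {"gestational_age_lt_28w": 0, "infection": 0, "pulmonary_surfactant": 1}
high_risk = {"gestational_age_lt_28w": 1, "infection": 1, "pulmonary_surfactant": 0}
```

Because the total score is a linear function of the linear predictor, a higher total always maps to a higher predicted probability, which is what allows the bottom axes of a nomogram to be read directly from the total points.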

Nomogram validation
We assessed the ability of the nomogram to predict NRF in NRDS patients. Figure 2 indicates that the area under the ROC curve (AUC) values of the nomogram were 0.9343 (Fig. 2A) and 0.9378 (Fig. 2B) for the occurrence of NRF at day 1 and day 28 in the training cohort, respectively; in the validation cohort, the corresponding AUC values were 0.9237 (Fig. 2C) and 0.9321 (Fig. 2D). This shows that our model has good predictive ability in both the training and validation cohorts [27]. Figure 2 also displays the calibration curves of the nomogram. The calibration curves of the training (Fig. 2E/F) and validation (Fig. 2G/H) cohorts indicate that the nomogram provided a good fit to the data, and that our models did not significantly overestimate or underestimate risk [28]. Finally, we performed a decision curve analysis to illustrate the clinical applicability of the nomogram (Fig. 3). It indicated that clinical interventions guided by our nomogram had a high net benefit [26].
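For readers unfamiliar with these metrics, the following Python sketch shows how an AUC and the net benefit used in decision curve analysis can be computed for a binary outcome. The predicted risks and outcomes are toy values, and the functions are simplified illustrations; the study may have used time-dependent versions of these measures for the day 1 and day 28 endpoints.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outranks a random non-case."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(controls))


def net_benefit(scores, labels, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy predicted risks and observed outcomes (1 = NRF), for illustration only.
SCORES = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
LABELS = [1, 1, 0, 1, 0, 0]

toy_auc = auc(SCORES, LABELS)            # 8 of 9 case-control pairs are concordant
toy_nb = net_benefit(SCORES, LABELS, 0.5)  # 2/6 - 1/6 * 1 = 1/6
```

A decision curve plots the net benefit over a range of threshold probabilities and compares it with the "treat all" and "treat none" strategies; a model is clinically useful at thresholds where its curve lies above both.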

Discussion
NRF secondary to NRDS is not uncommon. It may occur some time after the onset of NRDS, especially when multiple risk factors are present. We performed a large-sample, multi-risk-factor analysis and identified gestational age < 28 weeks, pH, PO2, PCO2, Hb, blood culture, infection, neonatal intracranial hemorrhage, PS, parenteral nutrition and respiratory support as independent risk factors for NRF in NRDS patients. These results were used to construct a nomogram for estimating the NRF risk in NRDS patients at day 1 and day 28 during hospitalization. The validity of our nomogram model was determined using multiple indicators, including AUC, calibration curves and decision curve analysis. In this study, we constructed a more comprehensive model based on a combination of various risk factors, to better predict the risk of NRF in patients with NRDS.
We found that most of the secondary NRF in NRDS patients occurred within one day [29]. This is consistent with the clinical features of NRDS, in which dyspnea develops gradually after birth and worsens progressively, so that most affected patients develop respiratory failure within 1 day. Premature infants with a gestational age of less than 28 weeks are at an increased risk of developing NRF following NRDS. This is primarily because premature infants have underdeveloped lungs, insufficient production of surface-active substances, and compromised immunity, which collectively increase the likelihood of disease progression and exacerbation [2]. In addition, we found that infection-related factors were also closely related to neonatal respiratory failure secondary to NRDS, including the clear presence of infection-related symptoms or positive microbial tests such as blood culture and CSF culture. This may be due to the decreased activity and increased degradation of PS caused by inflammatory mediators [30]. At the same time, inflammation can cause mechanical damage to type II alveolar epithelial cells and further reduce the secretion of PS [31]. Thus, patients whose initial cultures are positive for pathogens should receive prompt clinical attention, and initial anti-infective therapy should cover all likely pathogenic bacteria.
In terms of treatment, parenteral nutrition was associated with an increased risk of NRF, which may be related to infection due to parenteral nutrition, or to increased pulmonary circulation due to excessive fluid intake [32]. Therefore, rational parenteral nutrition and fluid management are critical in patients with NRDS. At the same time, the use of noninvasive ventilation and surfactant replacement can effectively reduce the occurrence of NRF. Noninvasive ventilation techniques, like nasal continuous positive airway pressure (nCPAP), offer positive end-expiratory pressure to NRDS patients. This aids in consistently expanding the alveoli, enhancing gas exchange, and subsequently mitigating the risk of NRF. As the respiratory distress in NRDS patients stems from a PS deficiency, replenishing PS further reduces the likelihood of NRF [33]. Blood gas analysis is an important laboratory test in neonatal respiratory management. Our study found that pH, PO2 and PCO2 are of great importance to NRF [10]. These indicators not only reflect the occurrence of NRF, but can also serve as risk factors for the early identification of NRF secondary to NRDS and prompt early intervention. Our findings revealed a significant association between reduced hemoglobin levels and disease development, potentially attributable to inadequate oxygenation among anemic children. Furthermore, the impact of intracranial hemorrhage on disease onset may be related to the central nervous system's role in respiratory regulation.
Clinical predictive models can be used to study the relationship between future outcome events and baseline status in patients [34]. They can integrate the results of traditional analyses, simplify them with more intuitive and convincing presentations, and predict the probability of certain outcome events with a scoring system [35]. NRDS is the most common respiratory disease in preterm infants. NRF caused by NRDS can be followed by multiple organ dysfunction, which has a great impact on the prognosis of preterm infants. At present, the risk factors for respiratory failure secondary to NRDS have not been well studied. Therefore, the establishment of this prediction model has important clinical significance for the early identification of NRF in patients with NRDS. Clinicians can use the scoring results of the model to communicate with the family members of the neonate, help them understand the severity of the child's condition, work out a treatment plan together, improve cooperation, and prevent the occurrence of NRF as far as possible. However, the predictive ability of this nomogram might be improved by considering other potentially important factors that we were not able to obtain from the MIMIC-IV database, such as maternal factors during pregnancy, perinatal medication and detailed parameters of non-invasive ventilation. In addition, although the number of patients included was large, this is a single-center study that lacks external validation.

Conclusion
This study identified the independent risk factors for respiratory failure in NRDS patients and used them to construct and evaluate a respiratory failure risk prediction model for NRDS. The findings can help clinicians identify NRDS patients at high risk of respiratory failure and intervene at an early stage.

Fig. 1 This nomogram estimates the likelihood of neonatal respiratory failure (NRF) in patients diagnosed with neonatal respiratory distress syndrome (NRDS). When using the nomogram, draw a vertical line from each variable to the points scale, noting the corresponding score, and then sum the scores for all variables to get a total. Finally, refer to the bottom of the nomogram to determine the predicted probability of NRF based on the total score. For comorbidities, 'Yes' indicates the presence and 'No' indicates the absence of the condition. For laboratory test results, 'Neg' stands for negative, and 'Pos' for positive. For treatment measures, 'Yes' indicates the measure was applied, while 'No' means it wasn't. PCO2, partial pressure of carbon dioxide; PO2, partial pressure of oxygen; Hb, hemoglobin

Table 1
Characteristics of the patients with NRDS included in the study. IQR, interquartile range; NRF, neonatal respiratory failure; PCO2, partial pressure of carbon dioxide; PO2, partial pressure of oxygen; WBC, white blood cells; Hb, hemoglobin; PLT, platelets; CSF, cerebrospinal fluid; IUGR, intrauterine growth retardation; PS, pulmonary surfactant

Table 2
Univariate and multivariate Cox regression analysis based on all variables for neonatal respiratory failure in NRDS patients. PCO2, partial pressure of carbon dioxide; PO2, partial pressure of oxygen; WBC, white blood cells; Hb, hemoglobin; HR, hazard ratio; PLT, platelets; CSF, cerebrospinal fluid; IUGR, intrauterine growth retardation; PS, pulmonary surfactant